=begin pod :tag<perl6>

=TITLE Syntax

=SUBTITLE General rules of Perl 6 syntax

Perl 6 borrows many concepts from human language. Which is not surprising,
considering it was designed by a linguist.

It reuses common elements in different contexts, has the notion of nouns
(terms) and verbs (operators), is context-sensitive (in the every day sense, not
necessarily in the Computer Science interpretation), so a symbol can have a
different meaning depending on whether a noun or a verb is expected.

It is also self-clocking, so that the parser can detect most of the common
errors and give good error messages.

=head1 Lexical Conventions

Perl 6 code is Unicode text, current implementations support UTF-8 as the
input encoding.

See also L<Unicode versus ASCII symbols|/language/unicode_ascii>.

=head2 Free Form

It is free-form, in the sense that you are mostly free to
chose the amount of whitespace you chose, though in some cases, the presence
or absence of whitespace carries meaning.

So you can write

=begin code
if True {
    say "Hello";
}
=end code

or

=begin code
    if True {
say "Hello";
        }
=end code

or

=begin code
if True { say "Hello" }
=end code

or even

=begin code
if True {say "Hello"}
=end code

though you can't leave out any of the remaining whitespace.

=head2 X<Unspace|syntax,\>
X<|syntax,Unspace>

In many places where the compiler would not allow a space you can use any
whitespace that is quoted with a backslash. Unspaces in tokens are not
supported. Newlines that are unspaced still count when the compiler produces
line numbers. Use cases for unspace are separation of postfix operators and
routine argument lists.

    sub alignment(+@l) { +@l };
    sub long-name-alignment(+@l) { +@l };
    alignment\         (1,2,3,4).say;
    long-name-alignment(3,5)\   .say;
    say Inf+Inf\i;

=head2 Separating Statements with Semicolons

A Perl 6 program is a list of statements, separated by semicolons C<;>.

=begin code
say "Hello";
say "world";
=end code

A semicolon after the final statement (or after the final statement inside a
block) is optional.

=begin code
say "Hello";
say "world"
=end code

=begin code
if True {
    say "Hello"
}
say "world"
=end code

=head2 Implied Separator Rule (for statements ending in blocks)

Complete statements ending in bare blocks can omit the trailing semicolon, if no
additional statements on the same line follow the block's closing curly
brace C<}>. This is called the "implied separator rule". For example, you
don't need to write a semicolon after an C<if> statement block as seen above, and
below.

=begin code
if True { say "Hello" }
say "world";
=end code

However, semicolons are required to separate a block from trailing statements in
the same line.

=begin code
if True { say "Hello" }; say "world";
#                     ^^^ this ; is required
=end code

This implied statement separator rule applies in other ways, besides control
statements, that could end with a bare block. For example, in combination with
the colon C<:> syntax for method calls.

=begin code
my @names = <Foo Bar Baz>;
my @upper-case-names = @names.map: { .uc }    # OUTPUT: [FOO BAR BAZ]
=end code

For a series of blocks that are part of the same C<if>/C<elsif>/C<else> (or similar)
construct, the implied separator rule only applies at the end of the last block of that series.
These three are equivalent:

=begin code
if True { say "Hello" } else { say "Goodbye" }; say "world";
#                                            ^^^ this ; is required
=end code

=begin code
if True { say "Hello" } else { say "Goodbye" } # <- implied statement separator
say "world";
=end code

=begin code
if True { say "Hello" }   # still in the middle of an if/else statement
else    { say "Goodbye" } # <- no semicolon required because it ends in a block
                          #    without trailing statements in the same line
say "world";
=end code

=head2 Comments

Comments are parts of the program text which are only intended for human readers;
the Perl 6 compilers do not evaluate them as program text.

Comments count as whitespace in places where the absence or presence of
whitespace disambiguates possible parses.

=head3 Single-line comments

The most common form of comments in Perl 6 starts with a single hash character
C<#> and goes until the end of the line.

=begin code :skip-test
if $age > 250 {     # catch obvious outliers
    # this is another comment!
    die "That doesn't look right"
}
=end code

=head3 Multi-line / embedded comments

Multi-line and embedded comments start with a hash character, followed by a
backtick, and then some opening bracketing character, and end with the matching
closing bracketing character. The content can not only span multiple lines, but
can also be embedded inline.

=begin code
if #`( why would I ever write an inline comment here? ) True {
    say "something stupid";
}
=end code

Brackets inside the comment can be nested, so in C<#`{ a { b } c }>, the
comment goes until the very end of the string. You may also use more complex
brackets, such as C<#`{{ double-curly-brace }}>, which might help disambiguate
from nested brackets.

=head3 Pod comments

Pod syntax can be used for multi-line comments

=begin code
say "this is code";

=begin comment

Here are several
lines
of comment

=end comment

say 'code again';
=end code

=head2 Identifiers

Identifiers are a grammatical building block that occur in several places. An
identifier is a primitive name, and must start with an alphabetic character
(or an underscore), followed by zero or more word characters (alphabetic,
underscore or number). You can also embed dashes C<-> or single quotes C<'>
in the middle, but not two in a row, and only if followed immediately by an
alphabetic character.

    =begin code :skip-test
    # valid identifiers:
    x
    something-longer
    with-numbers1234
    don't
    fish-food
    π
    =end code

    =begin code :skip-test
    # not valid identifiers:
    with-numbers1234-5
    42
    is-prime?
    fish--food
    =end code

Names of constants, types (including classes and modules) and routines (subs
and methods) are identifiers, and they also appear in variable names (usually
proceeded by a sigil; see L<variables|/language/variables> for more details.)

Namespaces are provided by L<packages|/language/packages>. By separating
identifiers with double colons, the right most name is inserted into existing
or automatically created packages.

    my Int $Foo::Bar::buzz = 42;
    say $Foo::Bar::buzz; # OUTPUT: «42␤»

Identifiers can contain colon pairs. The entire colon pair becomes part of the
name of the identifier.

    my $foo:bar = 1;
    my $foo:bar<2> = 2;
    say MY::.keys;
    # OUTPUT: «($=pod !UNIT_MARKER EXPORT $_ $! ::?PACKAGE
    #          GLOBALish $¢ $=finish $/ $foo:bar $foo:bar<2>
    #          $?PACKAGE)␤»

Colon pairs in identifiers support interpolation. Please note that resolution
of names often happens at compile time, so interpolation values must be known
at compile time.

    constant $c = 42;
    my $a:foo<42> = "answer";
    say $a:foo«$c»;
    # OUTPUT: «answer␤»

Unicode superscript numerals are exponents and not part of an identifier;
e.g. C<$x²> does the square of variable C<$x>.  Subscript numerals are TBD.

=head2 Term term:<>

You can use C«term:<>» to introduce new terms.

This is handy for introducing constants that defy the rules of normal identifiers:

    use Test; plan 1; constant &term:<👍> = &ok.assuming(True);
    👍
    # OUTPUT: «1..1␤ok 1 - ␤»

But terms don't have to be constant: you can also use them for functions that
don't take any arguments, and force the parser to expect an operator after them.

For instance:

    sub term:<dice> { (1..6).pick };
    say dice + dice;

can print any number between 1 and 12.

If instead we had declared `dice` as a regular `sub dice() {
(1...6).pick }`, the expression `dice + dice` would be parsed as
`dice(+(dice()))`, resulting in an error because `sub dice` expects zero
arguments.

=head1 Statements and Expressions

Perl 6 programs are made of lists of statements. A special case of a statement
is an I<expression>, which returns a value. For example C<if True { say 42 }>
is syntactically a statement, but not an expression, whereas C<1 + 2> is an
expression (and thus also a statement).

The C<do> prefix turns statements into expressions. So while

    =begin code :skip-test
    my $x = if True { 42 };     # Syntax error!
    =end code

is an error,

   my $x = do if True { 42 };

assigns the return value of the if statement (here C<42>) to the variable
C<$x>.


=head1 Terms

Terms are the basic nouns that, optionally together with operators, can form
expressions. Examples for terms are variables (C<$x>), barewords such as type
names (C<Int>), literals (C<42>), declarations (C<sub f() { }>) and calls
(C<f()>).

For example, in the expression C<2 * $salary>, C<2> and C<$salary> are two
terms (an L<integer|/type/Int> literal and a L<variable|/language/variables>).

=head2 Variables

Variables typically start with a special character called the I<sigil>, and
are followed by an identifier. Variables must be declared before you can use
them.

    # declaration:
    my $number = 21;
    # usage:
    say $number * 2;

See the L<documentation on variables|/language/variables> for more details.


=head2 Barewords (Constants, Type Names)

Pre-declared identifiers can be terms on their own. Those are typically type
names or constants, but also the term C<self> which refers to an object that
a method was called on (see L<objects|/language/objects>), and sigilless
variables:

    say Int;                # OUTPUT: «(Int)␤»
    #   ^^^ type name (built in)

    constant answer = 42;
    say answer;
    #   ^^^^^^ constant

    class Foo {
        method type-name {
            self.^name;
          # ^^^^ built-in term 'self'
        }
    }
    say Foo.type-name;     # OUTPUT: «Foo␤»
    #   ^^^ type name


=head2 Packages and Qualified Names

Named entities, such as variables, constants, classes, modules, subs, etc, are
part of a namespace. Nested parts of a name use C<::> to separate the
hierarchy. Some examples:

    =begin code :skip-test
    $foo                # simple identifiers
    $Foo::Bar::baz      # compound identifiers separated by ::
    $Foo::($bar)::baz   # compound identifiers that perform interpolations
    Foo::Bar::bob(23)   # function invocation given qualified name
    =end code

See the L<documentation on packages|/language/packages> for more details.


=head2 Literals

A L<literal|https://en.wikipedia.org/wiki/Literal_%28computer_programming%29>
is a representation of a constant value in source code. Perl 6 has literals
for several built-in types, like L<strings|/type/Str>, several numeric types,
L<pairs|/type/Pair> and more.


=head3 String literals

String literals are surrounded by quotes:

    say 'a string literal';
    say "a string literal\nthat interprets escape sequences";

See L<quoting|/language/quoting> for many more options.

=head3 X<Number literals|0b (radix form);0d (radix form);0o (radix form);0x (radix form)>

Number literals are generally specified in base ten, unless a prefix like
C<0x> (heB<x>adecimal, base 16), C<0o> (B<o>ctal, base 8) or C<0b>
(B<b>inary, base 2) or an explicit base in adverbial notation like
C<< :16<A0> >> specifies it otherwise. Unlike other programming languages,
leading zeros do I<not> indicate base 8; instead a compile-time warning is
issued.

In all literal formats, you can use underscores to group digits;
they don't carry any semantic information; the following literals all evaluate to the same
number:

    =begin code :skip-test
    1000000
    1_000_000
    10_00000
    100_00_00
    =end code

=head4 Int literals

Integers default to signed base-10, but you can use other bases. For details,
see L<Int|/type/Int>.

    =begin code :skip-test
    -2          # actually not a literal, but unary - operator applied to numeric literal 2
    12345
    0xBEEF      # base 16
    0o755       # base 8
    :3<1201>    # arbitrary base, here base 3
    =end code

=head4 Rat literals

L<Rat|/type/Rat> literals (rationals) are very common, and take the place of decimals or floats in many other languages. Integer division also results in a C<Rat>.

    =begin code :skip-test
    1.0
    3.14159
    -2.5        # Not actually a literal, but still a Rat
    :3<21.0012> # Base 3 rational
    ⅔
    2/3         # Not actually a literal, but still a Rat
    =end code

=head4 Num literals

Scientific notation with an integer exponent to base ten after an C<e> produces
L<floating point number|/type/Num>:

    =begin code :skip-test
    1e0
    6.022e23
    1e-9
    -2e48
    2e2.5       # error
    =end code

=head4 Complex literals

L<Complex|/type/Complex> numbers are written either as an imaginary number
(which is just a rational number with postfix C<i> appended), or as a sum of
a real and an imaginary number:

    =begin code :skip-test
    1+2i
    6.123e5i    # note that this is 6.123e5 * i and not 6.123 * 10 ** (5i)
    =end code

=head3 Pair literals

L<Pairs|/type/Pair> are made of a key and a value, and there are two basic forms for
constructing them: C<< key => 'value' >> and C<:key('value')>.

=head4 Arrow pairs

Arrow pairs can have an expression or an identifier on the left-hand side:

    =begin code :skip-test
    identifier => 42
    "identifier" => 42
    ('a' ~ 'b') => 1
    =end code

=head4 Adverbial pairs (colon pairs)

Short forms without explicit values:

    =begin code
    my $thing = 42;
    :$thing                 # same as  thing => $thing
    :thing                  # same as  thing => True
    :!thing                 # same as  thing => False
    =end code

The variable form also works with other sigils, like C<:&callback> or
C<:@elements>.

Long forms with explicit values:

    =begin code :skip-test
    :thing($value)              # same as  thing => $value
    :thing<quoted list>         # same as  thing => <quoted list>
    :thing['some', 'values']    # same as  thing => ['some', 'values']
    :thing{a => 'b'}            # same as  thing => { a => 'b' }
    =end code

=head3 Array literals

A pair of square brackets can surround an expression to form an itemized
L<Array|/type/Array> literal; typically there is a comma-delimited list
inside:

    say ['a', 'b', 42].join(' ');   # OUTPUT: «a b 42␤»
    #   ^^^^^^^^^^^^^^ Array constructor

If the constructor is given a single L<Iterable>, it'll clone and flatten it.
If you want an C<Array> with just 1 element that is that C<Iterable>, ensure
to use a comma after it:

    my @a = 1, 2;
    say [@a].perl;  # OUTPUT: «[1, 2]␤»
    say [@a,].perl; # OUTPUT: «[[1, 2],]␤»

The C<Array> constructor does not flatten other types of contents.
Use the L<Slip> prefix operator (C<|>) to flatten the needed items:

    my @a = 1, 2;
    say [@a, 3, 4].perl;  # OUTPUT: «[[1, 2], 3, 4]␤»
    say [|@a, 3, 4].perl; # OUTPUT: «[1, 2, 3, 4]␤»

=head3 Hash literals

A leading associative sigil and pair of parenthesis C<%( )> can surround a C<List> of
C<Pairs> to form a L<Hash|/type/Hash> literal; typically there is a comma-delimited
C<List> of C<Pairs> inside. If a non-pair is used, it is assumed to be a key and
the next element is the value. Most often this is used with simple arrow pairs.

    say %( a => 3, b => 23, :foo, :dog<cat>, "french", "fries" );
    # OUTPUT: «a => 3, b => 23, dog => cat, foo => True, french => fries␤»

    say %(a => 73, foo => "fish").keys.join(" ");   # OUTPUT: «a foo␤»
    #   ^^^^^^^^^^^^^^^^^^^^^^^^^ Hash constructor

When assigning to a C<%> sigiled variable on the LHS, the sigil and parenthesis
surrounding the RHS C<Pairs> are optional.

    my %ages = fred => 23, jean => 87, ann => 4;

By default, keys in C<%( )> are forced to strings. To compose a hash with
non-string keys, use curly brace delimiters with a colon prefix C<:{ }> :

    my $when = :{ (now) => "Instant", (DateTime.now) => "DateTime" };

Note that with objects as keys, you cannot access non-string keys as strings:

    say :{ -1 => 41, 0 => 42, 1 => 43 }<0>;  # OUTPUT: «(Any)␤»
    say :{ -1 => 41, 0 => 42, 1 => 43 }{0};  # OUTPUT: «42␤»

=head3 Regex literals

A L<Regex|/type/Regex> is declared with slashes like C</foo/>. Note that this C<//> syntax is shorthand for the full C<rx//> syntax.

    =begin code :skip-test
    /foo/          # Short version
    rx/foo/        # Longer version
    Q :regex /foo/ # Even longer version

    my $r = /foo/; # Regexes can be assigned to variables
    =end code

=head3 Signature literals

Signatures can be used standalone for pattern matching, in addition to the
typical usage in sub and block declarations. A standalone signature is declared
starting with a colon:

    say "match!" if 5, "fish" ~~ :(Int, Str); # OUTPUT: «match!␤»

    my $sig = :(Int $a, Str);
    say "match!" if (5, "fish") ~~ $sig; # OUTPUT: «match!␤»

    given "foo", 42 {
      when :(Str, Str) { "This won't match" }
      when :(Str, Int $n where $n > 20) { "This will!" }
    }

See the L<Signatures|/type/Signature> documentation for more about signatures.

=head2 Declarations

=head3 Variable declaration

    my $x;                          # simple lexical variable
    my $x = 7;                      # initialize the variable
    my Int $x = 7;                  # declare the type
    my Int:D $x = 7;                # specify that the value must be defined (not undef)
    my Int $x where { $_ > 3 } = 7; # constrain the value based on a function
    my Int $x where * > 3 = 7;      # same constraint, but using L<Whatever> short-hand

See L<Variable Declarators and
Scope|/language/variables#Variable_Declarators_and_Scope>
for more details on other scopes (our, has).

=head3 Subroutine declaration

    # The signature is optional
    sub foo { say "Hello!" }

    sub say-hello($to-whom) { say "Hello $to-whom!" }

You can also assign subroutines to variables.

    my &f = sub { say "Hello!" } # Un-named sub
    my &f = -> { say "Hello!" }  # Lambda style syntax. The & sigil indicates the variable holds a function
    my $f = -> { say "Hello!" }  # Functions can also be put into scalars

=head3 Module, Class, Role, and Grammar declaration

There are several types of compilation units (packages), each declared with a
keyword, a name, some optional traits, and a body of functions.

    module Gar { }

    class Foo { }

    role Bar { }

    grammar Baz { }

You can declare a unit of things without explicit curly brackets.

    =begin code :skip-test
    unit module Gar;
    # ... stuff goes here instead of in {}'s
    =end code

=head3 Multi-dispatch declaration

See also L<Multi-dispatch|/language/functions#Multi-dispatch>.

Subroutines can be declared with multiple signatures.

    multi sub foo() { say "Hello!" }
    multi sub foo($name) { say "Hello $name!" }

Inside of a class, you can also declare multi-dispatch methods.

    multi method greet { }
    multi method greet(Str $name) { }

=head1 Subroutine calls

Subroutines are created with the keyword C<sub> followed
by an optional name, an optional signature and a code block.
Subroutines are lexically scoped, so if a name is specified
at the declaration time, the same name can be used
in the lexical scope to invoke the subroutine.
A subroutine is an instance of type L<Sub|/type/Sub> and
can be assigned to any container.

=comment TODO

    =begin code :skip-test
    foo;   # Invoke the function foo with no arguments
    foo(); # Invoke the function foo with no arguments
    &f();  # Invoke &f, which contains a function
    &f.(); # Same as above, needed to make the following work
    my @functions = ({say 1}, {say 2}, {say 3});
    @functions>>.(); # hyper method call operator
    =end code

When declared within a class, a subroutine is named "method": methods
are subroutines invoked against an object (i.e., a class instance).
Within a method the special variable C<self> contains the object instance
(see L<Methods|/language/classtut#Methods>).

    =begin code :skip-test
    # Method invocation. Object (instance) is $person, method is set-name-age
    $person.set-name-age('jane', 98);   # Most common way
    $person.set-name-age: 'jane', 98;   # Precedence drop
    set-name-age($person: 'jane', 98);  # Invocant marker
    =end code

For more information see L<functions|/language/functions>.

=head2 Precedence Drop

In the case of method invocation (i.e., when invoking a subroutine against
a class instance) it is possible to apply the C<precedence drop>, identified
by a colon C<:> just after the method name and before the argument list.
The argument list takes precedence over the method call, that on the other
hand "drops" its precedence. In order to better understand consider
the following simple example (extra spaces have been added just to align
method calls):

    =begin code :skip-test
    my $band = 'Foo Fighters';
    say $band.substr( 0, 3 ) .substr( 0, 1 ); # F
    say $band.substr: 0, 3   .substr( 0, 1 ); # Foo
    =end code

In the second method call the rightmost C<substr> is applied to "3" and not
to the result of the leftmost C<substr>, which on the other hand yields precedence
to the rightmost one.

=head1 Operators

See L<Operators|/language/operators> for lots of details.

Operators are functions with a more symbol heavy and composable syntax. Like
other functions, operators can be multi-dispatch to allow for context-specific
usage.

There are five types (arrangements) for operators, each taking either one or two arguments.

    =begin code :skip-test
    ++$x           # prefix, operator comes before single input
    5 + 3          # infix, operator is between two inputs
    $x++           # postfix, operator is after single input
    <the blue sky> # circumfix, operator surrounds single input
    %foo<bar>      # postcircumfix, operator comes after first input and surrounds second
    =end code

=head2 Meta Operators

Operators can be composed. A common example of this is combining an infix
(binary) operator with assignment. You can combine assignment with any binary
operator.

    =begin code :skip-test
    $x += 5     # Adds 5 to $x, same as $x = $x + 5
    $x min= 3   # Sets $x to the smaller of $x and 3, same as $x = $x min 3
    $x .= child # Equivalent to $x = $x.child
    =end code

Wrap an infix operator in C<[ ]> to create a new reduction operator that works
on a single list of inputs, resulting in a single value.

    say [+] <1 2 3 4 5>;    # OUTPUT: «15␤»
    (((1 + 2) + 3) + 4) + 5 # equivalent expanded version

Wrap an infix operator in C<« »> (or the ASCII equivalent C<<< >>>) to create a
new hyper operator that works pairwise on two lists.

    say <1 2 3> «+» <4 5 6> # OUTPUT: «(5 7 9)␤»

The direction of the arrows indicates what to do when the lists are not the same size.

    =begin code :skip-test
    @a «+« @b # Result is the size of @b, elements from @a will be re-used
    @a »+» @b # Result is the size of @a, elements from @b will be re-used
    @a «+» @b # Result is the size of the biggest input, the smaller one is re-used
    @a »+« @b # Exception if @a and @b are different sizes
    =end code

You can also wrap a unary operator with a hyper operator.

    say -« <1 2 3> # OUTPUT: «(-1 -2 -3)␤»

=end pod

# vim: expandtab softtabstop=4 shiftwidth=4 ft=perl6
